\subsection{Parachains}\label{sec:parachains}
In this section we review parachain's block production, their availability and validity scheme, and their messaging scheme.
\subsubsection{Block Production}\label{sec:parachainblockproduction}

We will discuss block production for a general parachain. At the end of the section, we will discuss alternatives.

In outline, a collator produces a parachain block, sends it to the parachain validators,
who sign its header as valid, and the header with enough signatures is placed on the relay chain.
At this point, the parachain block is as canonical as the relay chain block its header appeared in.
If this relay chain block is in the best chain according to BABE (see Section \ref{sec:babe}), so is the parachain block and when this relay chain block is finalised, so is the parachain block.

Because the parachain validators switch parachains frequently, they are stateless clients of the parachain.
Thus we distinguish between the parachain block $B$, which is normally enough for full nodes of the parachain such as collators to update the parachain state,
and the {\em Proof of Validity(PoV)} block $B_{PoV}$, which a validator who does not have the parachain state can verify.

Any validator should be able to verify $B_{PoV}$ given the relay chain state using the parachain's {\em state transition validation function} (STVF),
the Web assembly code for which is stored on the relay chain in a similar way to the relay chain's runtime.
The STVF takes as an input the PoV block, the header of the last parachain block from this parachain and a small amount of data from the relay chain state.
	
The STVF outputs the validity of the block, the header of this block and its outgoing messages.
The PoV block contains any outgoing messages and the parachain block $B$. The parachain validators should gossip the parachain block to the parachain network, as a back up to the collator itself doing so.

The PoV block will be the parachain block, its outgoing messages, its header and light client proof witnesses. These witnesses are Merkle proofs that give all elements of the input and output state that are used or modified by the state transition from the input and output state roots.


To aid in censorship resistance, a parachain may want to use proof of work or proof of stake to select collators, where the selection strategy is up to the given parachain.
This can be implemented in the STVF and need not be a part of the Polkadot protocol. So for proof of work,
the STVF would check that the hash of the block is sufficiently small.
However, for speed, it would be useful to ensure that most relay chain blocks can include a parachain block.
For PoW, this would necessitate it being probable that multiple collators are allowed to produce a block.
As such we will still need a tie-breaker for the parachain validators to coordinate on validating the same parachain block first.
This may be the golden ticket scheme of \cite{2016:Wood:Polkadot}. For proof of stake this may not be necessary.

Optionally, for some parachains, the parachain block $B$ may not be enough for collators to update their state. This may happen for chains that use succinct zero-knowledge proofs to update their state, or even for permissioned chains that just give signatures from authorities for validity. Such chains may have another notion of parachain block which is actually needed to update their state and must have their own scheme to guarantee the availability of this data.





\subsubsection{Validity and Availability} \label{sec:validity-and-availability}
Once a parachain block is created it is important that the {\em parachain blob} consisting of the PoV block and set of outgoing messages from the parachain is available for a while.
The naive solution for this would be broadcasting/gossip the parachain blobs to all relay chain nodes, which is not a feasible option because there are many parachains and the PoV blocks may be big. We want to find an efficient solution to ensure PoV blocks from any recently created parachain blocks are available.

For a single chain, such as Bitcoin, as long as 51\% of hash power is honest, not making block data available ensures that no honest miner builds on it so it will not be in the final chain. However, parachain consensus in Polkadot is determined by relay chain consensus.
A parachain block is canonical when its header is in the relay chain.
We have no guarantees that anyone other than the collator and parachain validators have seen the PoV block.
If these collude then the rest of the parachain network need not have the parachain block and then most collators cannot build a new block and this block's invalidity may not be discovered. We would like the consensus participants, here the validators, to collectively guarantee the availability rather than relying on a few nodes.

To this end we designed an availability scheme that uses erasure coding (see e.g. \cite{availabilityETH2}) to distribute the PoV block to all validators.
When any misbehaviour, particularly in relation to invalidity, is detected, the blob can be reconstructed from the distributed erasure coded pieces.

If a block is available then full nodes of the parachain, and any light client that has the PoV block, can check its validity. We have three-level of validity checks in Polkadot. The first validity check of a PoV block is executed by the corresponding parachain validators. If they verify the PoV block then they sign and distribute the erasure codes of the blob, including the PoV block, to each validator.
We rely on nodes acting as fishermen to report the invalidity of a blob as  a second level of validity checking. They would need to back any claim with their own stake in DOTs. We would assume that most collators will be fishermen, as they have a stake in continued validity of the chain and are already running full nodes, so all they need is stake in DOTs. The third level of validity checking is executed by a few randomly and privately assigned validators. We determine the number of validators in the third level of validity checking considering the amount of invalidity reports given by fishermen and unavailability reports given by  collators. If an invalid parachain block is detected, the validators who signed for its validity are slashed.
We wait for enough of these randomly assigned checkers to check the block before voting on it in GRANDPA. We also  want to ensure that the block is available before selecting the randomly assigned validators. This means that the parachain validators have to commit running a high risk of being slashed for a small probability of getting an invalid block finalised. This means that the expected cost of getting an invalid block into Polkadot is higher than the amount of stake backing a single parachain.

The security of our availability and validity scheme is based on the security of the GRANDPA finality gadget (see Section \ref{sec:grandpa}) and the quality of randomness generated in each BABE epoch (see Section \ref{sec:babe}). Please see \cite{availandvalid} for more details about the availability and validity scheme.


\subsubsection{Cross Chain Messaging Passing (XCMP)} \label{sec:XCMP}
XCMP is the protocol that parachains use to send messages to each other.
It aims to guarantee four things: first that messages arrive quickly; second that messages from one parachain arrive to another in order; third that arriving messages were indeed sent in the finalised history of the sending chain; and fourth that recipients will receive messages fairly across senders, helping guarantee that senders never wait indefinitely for their messages to be seen.



There are two parts to XCMP. (1) Metadata about outgoing messages for a parachain block are included on the relay chain and later this metadata is used to authenticate messages by the receiving parachain. (2) The message bodies corresponding to this metadata need to be actually distributed from the senders to the recipients, together with a proof that the message body is actually associated with the relevant metadata. The details of distribution are covered as a networking protocol in \nameref{sec:net_crosschain}; the remainder is covered below.

The way relay chain blocks include headers of parachain blocks gives a synchronous notion of time for parachain blocks, just by relay chain block numbers. Additionally it allows us to authenticate messages as being sent in the history given by the relay chain i.e. it is impossible that one parachain sends a message, then reorgs \footnote{reorganisation of the chain} so that that message was not sent, but has been received. This holds even though the system may not have reached finality over whether the message was sent, because any relay chain provides a consistent history.

Because we require parachains to act on every message eventually, non-delivery of a single message can potentially stop a parachain from being able to build blocks. Consequently we need enough redundancy in our message delivery system. Any validators who validate the PoV block should keep any outgoing messages from that block available for a day or so and all full nodes of the sending parachain also store the outgoing messages until they know they have been acted on.

To achieve consistency, when a source parachain $S$ sends messages in a parachain block $B$ to a destination parachain $D$, then we will need to authenticate these using the relay chain state, which is updated based on the parachain header $PH$ corresponding to the parachain block $B$ that was included in the relay chain. We need to limit the amount of data in headers like $PH$, in the relay chain state and also to limit what the relay chain needs to do for authentication when processing such parachain headers.

To this end, the parachain header $PH$ contains a message root $M$ of outgoing messages, as well as a bitfield indicating which other parachains were sent messages in this block.
The message root $M$ is the root of a Merkle tree of the head hash $H_p$ of a hash chain for each parachain $p$ that this block sends messages to.  The hash chain with head $H_D$ has the hash of all messages sent to $S$ from $D$, not just in block $B$ but ever sent from $S$ to $D$ in any block up to block $B$. This allows many messages from $S$ to $D$  to be verified at once from $M$. However the messages themselves are passed, they should also be sent with the Merkle proof that allows nodes of the receiving parachain
to authenticate that they were sent by a $B$ whose header  $PH$ was in a particular relay chain block.

Parachains receive incoming messages in  order. Internally parachains may defer or reorder acting on messages according to their own logic (possibly constrained by SPREE, see \ref{sec:SPREE}). However they must receive messages in the order determined by the consistent history given by the relay chain.
A parachain $D$ always receives messages sent by parachain blocks whose header was in earlier relay chain blocks first. When several such source parachains have a header in the relay chain block, the messages from these parachains are received in some predetermined order of parachains, either sequentially in order of increasing parachain id, or some shuffled version of this.

A parachain $ D  $ receives all messages sent by one parachain $ S $ in one parachain block or none of them.
A parachain header $PH'$ of $D$ contains a watermark. This watermark consists of a block number of a relay chain block $R$ and a parachain id of a source parachain $S$. This indicates that $D$ has recieved all messages sent by all chains before relay chain block $R$, and has acted on messages sent in block $R$ from parachains up to and including $S$ in the ordering.



The watermark must advance by at least one sending parachain in each of $ D $’s parachain blocks, which means that the watermark's relay chain block number advances or it stays the same and we only advance the parachain. To produce a parachain block on parachain $D$ which builds on a particular relay chain block $R$, a collator would need to look at which parachain headers were built between the relay chain block that the last parachain block of this chain built on. In addition, it needs the corresponding message data for each of those that indicated that they sent messages to $D$.
Thus it can construct a PoV block so that the STVF can validate that all such messages were acted on. Since a parachain must accept all messages that are sent to it,
we implement a method for parachains to make it illegal for another parachain to send it any messages that can be used in the case of spam occurring. When the parachain header of a parachain block that sends a message is included in a relay chain block, then any nodes connected to both the source and destination parachain networks should forward messages, together with their proofs, from sender to receiver.
 The relay chain should at least act as a back up: the receiving parachain validators  of $D$ are connected to $D$'s parachain network and if they do not receive messages on it, then they can ask for them from the parachain validators of the sending chain $S$ at the time the message was sent.

